FOREWORD


The  study  upon  which  this  report  is  based  was  conducted  in  support 
of  Project  2806,  Task  280609,  during  the  period  of  November,  1963 
through  May,  1965. 

This report consists of a paper presented as part of the symposium,
"Current Trends in Computer-Based Instructional Systems," at the
National Society for Programed Instruction convention in Philadelphia,
Pennsylvania on 8 May 1965.

In part, it describes a computer-based subjective probability response
technique developed by this Laboratory. Based upon the mathematical
concepts defined by Toda (in ESD-TDR-63-407), the prototype design of
this measurement technique was created jointly by the author and W. E.
Organist. The technique shows promise of getting more information
per response for use in computer-assisted instruction, testing, and
psychological experimentation.

Subsequent  to  this  report,  this  measurement  approach  evolved  into 
a  system  that  serviced  four  subject  stations  at  the  same  time  and  has 
been  used  in  experiments  which  will  be  reported  on  separately. 

Robert T. Rizzo, of the Arcon Corporation, was responsible for the
computer program design and programming of the prototype. James D.
Baker and Ira Goldstein contributed to the final design and implementation.


This  Technical  Report  has  been  re^ 
ABSTRACT


This report presents a concrete realization of an admissible
probability measurement procedure utilizing a computer-driven
scope and light pen. This particular technique is appropriate
for all testing of the multiple-choice type.

Empirical results are reported from an analogous pencil-and-paper
realization of the same admissible probability measurement
procedure. These results indicate a marked superiority for
admissible probability measurement over traditional multiple-choice
testing.

It  is  suggested  that  further  gains  can  be  obtained  by  using 
admissible  probability  measurement  procedures  to  sequentially 
test  the  scope  of  knowledge  of  a  student. 
CYBERNETIC TESTING


Emir H. Shuford, Jr.

A computer is essentially a factory for the very rapid processing of
information. Computers can be used effectively to reduce the cost of
information processing whenever the speed of the computer can be applied
to an information processing problem which is of a very highly repetitive
nature. This allows the cost of programming to be amortized over many
instances of application, each justifying a part of the total cost.

Most applications of computers have involved just the substitution
of automatic information processing for some part of a more complex,
already existing enterprise. Such a direct substitution can dramatically
reduce the cost and increase the capacity for information processing, but
the full potential of this change can generally not be realized unless
other changes are made in the operation of the enterprise. For example,
it is sometimes necessary to reduce also the cost of obtaining and reacting
to information by introducing techniques for the automatic sensing of and
responding to information. This type of effective application of a computer
is well represented by the computer-based instructional systems just
described by Professors Hansen and Stolurow.

In these applications, the computer systems (a) measure the current
state of the student's knowledge, (b) process this information to determine
what instructional material must be presented next in order to improve the
student's knowledge, and (c) effect the presentation of the material.

This is quite clearly a cybernetic control process with the computer used
as a controller which senses the state of a controlled system and then
takes corrective action to move the controlled system to a more desirable
state. Notice, however, that complete automation is not essential to the
nature of the cybernetic process. A teacher conducting a course in a
classroom, a school system promoting students to the next grade level,
and a student guiding the course of his own study are also examples of
the cybernetic control process as applied to the development of knowledge
(Shuford & Massengill, 1965).

Now, when the educational process is looked at from this point of
view, it is not difficult to see that the effectiveness of the educational
process depends, in part, upon how well we can observe the present state
of the student's knowledge. This observational process, in turn, determines
the sensitivity with which we can follow the educational development of
the student. Indeed, the recent emergence of admissible probability
measurement procedures (Shuford, Albert, & Massengill, 1965), which yield
much more information about the current state of a student's knowledge
than do the multiple-choice and constructed-response test procedures
(Massengill & Shuford, 1965), suggests that it may be possible to achieve
even greater increases in effectiveness over and above that resulting
solely from the introduction of computer-based instructional systems based
on traditional measurement techniques.

In order to distinguish these new applications based on probability
measurement from the other currently used applications based on choice
procedures, I would like to introduce two new terms. First, cybernetic
instruction refers to any computer-based instructional system utilizing
probability measurement to follow the development of a student's knowledge.
Second, cybernetic testing refers to the use of probability measurement
where the computer may be used to control the testing or to analyze the
results, but not to control the complete course of instruction. Thus,
cybernetic testing may be used to aid any instructional procedure or in
association with any instructional media.

Now, what is an admissible probability measurement procedure? First,
let me define it and then we will get down to cases. An admissible
probability measurement procedure has a scoring system which guarantees
that any student, at whatever level of knowledge or skill, can maximize
his expected score if and only if he follows instructions and honestly
reflects his degree-of-belief probability as to the correctness of each
possible answer to the test item. These degree-of-belief probabilities
contain all of the information that can be made available about the
student's knowledge structure as a consequence of asking the particular
question under consideration. By way of contrast, multiple-choice and
constructed-response test procedures can yield only partial information
as to whether or not these probabilities exceed certain values or lie
within a very broad range. It is probably best at this point to
consider a concrete example of an admissible probability measurement
procedure used in conjunction with a computer-based system, i.e., cybernetic
testing. Let's look at some pictures which illustrate multiple-choice
testing on a computer-driven scope and light pen.
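
Before the pictures, the defining property of an admissible procedure
can be made concrete with a minimal sketch. It assumes a quadratic
scoring rule rescaled to run from zero to one; the text does not say
which admissible rule the prototype used, so the rule and the names
quadratic_score, expected_score, and best_report are illustrative
assumptions. The sketch searches a grid of possible reports and finds
that the report with the highest expected score is the student's own
degree-of-belief vector.

```python
from itertools import product


def quadratic_score(report, correct):
    """Assumed rule (not taken from the source): (1 + 2*r_c - sum r_j^2) / 2."""
    return (1 + 2 * report[correct] - sum(r * r for r in report)) / 2


def expected_score(belief, report):
    """Expected score of `report` when answer k is correct with probability belief[k]."""
    return sum(b * quadratic_score(report, k) for k, b in enumerate(belief))


def best_report(belief, steps=20):
    """Search a grid of probability vectors for the highest expected score."""
    n = len(belief)
    best, best_value = None, float("-inf")
    for counts in product(range(steps + 1), repeat=n - 1):
        if sum(counts) > steps:
            continue  # the last probability would be negative
        report = [c / steps for c in counts] + [(steps - sum(counts)) / steps]
        value = expected_score(belief, report)
        if value > best_value + 1e-12:
            best, best_value = report, value
    return best


# Hypothetical degree-of-belief probabilities for a four-answer item.
belief = [0.1, 0.6, 0.25, 0.05]
print(best_report(belief))
print(expected_score(belief, belief))          # honest report
print(expected_score(belief, [0, 1, 0, 0]))    # all-or-nothing choice
```

Under this assumed rule the honest report has an expected score of about
.72 for the example beliefs, while staking everything on the most likely
answer has an expected score of .60.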

Figure 1 shows a student seated before a computer-driven scope
and light pen ready to begin taking the multiple-choice test.


Figure 2 shows the first question on the test. The student has read
the question and feels that she is ready to answer it. So, she points
the light pen at the CONTINUE sign.


Figure 3 shows the four mutually exclusive and exhaustive possible
answers to Question 1. The horizontal line by each answer represents
the probability currently assigned to the correctness of that answer.
The number to the left of the line represents the score that the student
would receive if, in fact, that answer were correct. The score ranges
between zero and one instead of being limited to just the extreme values
of zero and one as is the current scoring practice.


Figure 4 shows the student pointing the light pen to adjust the probabilities
and possible scores. The student has no doubt that the correct answer is the
"Mean Value Theorem" so she points the light pen at the INC sign. The
probability assigned to the "Mean Value Theorem" increases at a constant rate
while total probability is conserved by the automatic decrease of the remaining
probabilities. Now the student will receive a score of 1.0 if this is the
correct answer, but nothing if any of the other answers is correct.
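
The way total probability is conserved can be sketched as follows.
The description says only that the remaining probabilities decrease
automatically; that they shrink in proportion to their current size is
an assumption made here, and adjust is a hypothetical name. A positive
delta stands for one tick of holding the pen on INC, a negative delta
for one tick on DEC.

```python
def adjust(probs, index, delta):
    """Move probs[index] by delta and rescale the others so the total stays 1.

    Proportional rescaling of the other answers is an assumption; the
    source does not describe how the decrease is shared out.
    """
    p_old = probs[index]
    p_new = min(1.0, max(0.0, p_old + delta))  # clamp to [0, 1]
    rest_old = 1.0 - p_old
    rest_new = 1.0 - p_new
    others = len(probs) - 1
    result = []
    for i, p in enumerate(probs):
        if i == index:
            result.append(p_new)
        elif rest_old > 0.0:
            result.append(p * rest_new / rest_old)
        else:
            # Every other answer was at zero: share the released mass equally.
            result.append(rest_new / others)
    return result


# Start from equal probabilities and hold the pen on INC for answer 1.
probs = [0.25, 0.25, 0.25, 0.25]
for _ in range(100):
    probs = adjust(probs, 1, 0.01)
print(probs)
```

Once the chosen answer reaches 1.0 further ticks have no effect, which
corresponds to the state shown in Figure 4.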


Figure  5  shows  the  student  pointing  the  light  pen  at  CONTINUE.  She  is 
satisfied  with  her  probability  assignment  and  wants  to  find  out  how  well 
she  scored. 


Figure 6 shows that the "Mean Value Theorem" is the correct answer to the
first question, that the student received a score of 1.0 on the question,
and that her total score to this point in the test is 1.0.


Figure  7  shows  the  student  pointing  at  CONTINUE.  This  will  cause  the 
next  question  to  be  displayed. 


Figure  8  shows  the  second  question.  The  student  has  read  the  question 
and  points  to  CONTINUE. 


Figure 9 shows the four possible answers to the second question.


Figure 10 shows the student pointing at the DEC sign associated with
"Abraham Lincoln." After reading the four answers, she is quite certain
that "Abraham Lincoln" is not correct so she points at the DEC sign
until the probability associated with that answer is reduced to zero.


Figure 11 shows the student pointing at the GO BACK sign. She would like
to review the question, and pointing at GO BACK redisplays it.


Figure 12 shows the second question being displayed again. After the
student has read it, she will point at CONTINUE.


Figure  13  shows  the  response  frame  of  the  second  question  being 
redisplayed.  Note  that  the  frame  appears  exactly  as  it  did  when 
the  student  pointed  at  the  GO  BACK  sign. 


Figure 14 shows the student pointing at the DEC sign associated with "John
Adams." She has decided that the fourth answer, "John Adams," is certainly
not the correct one and so reduces the probability assigned to this answer
to zero.


Figure 15 shows the student pointing at the INC sign beside "Warren Harding."
She is sure that neither the first nor the fourth answer is correct, but
she is not completely certain which of the remaining two answers is correct.
She is, however, fairly certain that the second answer, "Warren Harding,"
is the correct one, so she points at the INC sign to divide the probability
between these two answers to reflect this feeling. Notice that she does not
feel that she can exclude the third answer, "Benjamin Harrison."


Figure 17 shows that "Warren Harding" is the correct answer to the
second question, that the student received a score of .96 on this
question, and that now her total test score is 1.96. When the student
points at CONTINUE, she will move on to the next item, and so on.
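
The scores shown beside each answer and the running total can be
sketched under stated assumptions. The scoring rule is not given in the
text; the quadratic rule below is an assumption, as are the name
displayed_scores, the position of the "Mean Value Theorem" among the
answers to Question 1, and the .8/.2 split on Question 2. That split is
chosen only because, under the assumed rule, it gives the .96 reported
in Figure 17; it does not establish what the student actually entered
or which rule the prototype used.

```python
def displayed_scores(probs):
    """Score the student would receive for each answer, were it the correct one.

    Assumed quadratic rule, rescaled to lie between 0 and 1.
    """
    sum_sq = sum(p * p for p in probs)
    return [(1 + 2 * p - sum_sq) / 2 for p in probs]


# Question 1: all probability on one answer (its position is hypothetical).
question_1 = [0.0, 1.0, 0.0, 0.0]
# Question 2: Lincoln, Harding, Harrison, Adams; the .8/.2 split is hypothetical.
question_2 = [0.0, 0.8, 0.2, 0.0]

score_1 = displayed_scores(question_1)[1]
score_2 = displayed_scores(question_2)[1]
total = score_1 + score_2
print(round(score_1, 2), round(score_2, 2), round(total, 2))
```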

Though these pictures have illustrated admissible probability
measurement only for multiple-choice type items, it is important to
note that procedures exist for having the student supply his own answer
(Shuford, Albert, & Massengill, 1965). Thus, it is now possible to
measure a student's degree-of-belief probabilities for almost all
objective test and programmed instructional material. Realize that no
information is lost by substituting admissible probability measurement
procedures for the choice procedures currently in use since a student's
choices can be reconstructed from knowledge of his probabilities, i.e.,
the student would be expected to choose the most likely answer if given
the opportunity (Shuford & Massengill, 1965).
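
The reconstruction runs in one direction only, which a short sketch
makes plain. The function name implied_choice and both probability
vectors are hypothetical.

```python
def implied_choice(probs):
    """Index of the answer a student would be expected to choose: the most likely one."""
    return max(range(len(probs)), key=lambda i: probs[i])


confident = [0.0, 0.8, 0.2, 0.0]   # hypothetical student, fairly sure of answer 1
hesitant = [0.3, 0.4, 0.2, 0.1]    # hypothetical student, barely favors answer 1

# Both students would mark the same answer on a conventional test.
print(implied_choice(confident), implied_choice(hesitant))
```

The choice can be recovered from the probabilities, but the two quite
different states of knowledge cannot be recovered from the choice.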

The  guarantee  that  no  information  is  lost  would  be  sufficient  to 
justify  the  use  of  admissible  probability  measurement  procedures  and 
high-speed  digital  computers  only  if  this  substitution  were  a  cheaper 
way  of  doing  what  was  done  before.  It  is  not.  It  generally  takes  a 
student  a  little  longer  to  express  his  probabilities  and  the  purchase 
of  a  computer  system  is,  at  present,  not  a  trivial  economic  decision. 
Therefore,  cybernetic  testing  is  going  to  have  to  be  able  to  do  things 
somewhat  better,  either  more  effectively  or  more  cheaply.  So,  let's 
consider  the  gains  that  can  result  from  cybernetic  testing. 

Using college students and a pencil-and-paper test form of
probability measurement, Walt Organist and I found that multiple-choice
tests yielding split-half reliabilities in the range .6 to .7 for the
number of items correct, i.e., scored in the usual way, yielded split-half
reliabilities in the vicinity of .9 for total test scores and
other measures obtained through probability measurement. In addition,
theoretical arguments can be given which indicate that these increased
reliabilities will be found in almost all testing situations encountered
in practice. Therefore, a teacher using cybernetic testing can reasonably
expect to grade her students more accurately and precisely and, of course,
since correlations and validities are limited by test reliabilities, she
can expect her tests to give better predictions and to have higher
validity.
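
For readers who want to compute a split-half reliability themselves, a
minimal sketch follows. It assumes an odd-even split of the items and
the Spearman-Brown step-up; the text does not say how the halves were
formed or whether the step-up was applied, and the item scores below are
invented toy values, not the data from the study. The last function
states the classical bound behind the remark that validities are limited
by reliabilities.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists (undefined if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def split_half_reliability(item_scores, spearman_brown=True):
    """item_scores[s][i] is student s's score on item i; halves are odd/even items."""
    first = [sum(row[0::2]) for row in item_scores]
    second = [sum(row[1::2]) for row in item_scores]
    r = pearson(first, second)
    return 2 * r / (1 + r) if spearman_brown else r


def validity_ceiling(test_reliability, criterion_reliability=1.0):
    """Upper bound on the correlation of a test with a criterion."""
    return sqrt(test_reliability * criterion_reliability)


# Invented toy scores for four students on four items.
toy_scores = [
    [0.9, 0.8, 1.0, 0.7],
    [0.5, 0.6, 0.4, 0.5],
    [0.2, 0.3, 0.1, 0.4],
    [0.7, 0.9, 0.8, 0.6],
]
print(split_half_reliability(toy_scores))
print(validity_ceiling(0.65), validity_ceiling(0.9))
```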

To consider another result, first realize that with cybernetic
testing there is no longer any need to average over test items or over
different students since reliable information can be obtained from each
individual query. There is, however, something interesting that can be
done by examining the pattern of probabilities given to the answers of
one item by all students in a class. In most cases, it can be determined
with great precision both how well the subject matter has been taught and
how well the test items and answers have been written. This cannot be
done with currently used testing techniques, but by having a computer
examine the pattern of probabilities, a teacher can obtain information
that would enable her to improve her teaching of the course and the quality
of the items that she uses to test for understanding of the subject matter.
She can also, of course, by examining the pattern of probabilities for
each student, gain diagnostic information useful in giving individual
attention to her students and in understanding the teaching-learning process.
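
One simple way a computer might examine the class pattern for a single
item is sketched below. The talk does not describe the analysis, so the
use of class-average probabilities, the two thresholds, the wording of
the three labels, and the names mean_profile and describe_item are all
assumptions made for illustration.

```python
def mean_profile(class_probs):
    """Average probability given to each answer over all students in the class."""
    n = len(class_probs)
    return [sum(column) / n for column in zip(*class_probs)]


def describe_item(class_probs, correct, high=0.7, low=0.4):
    """Rough reading of the class pattern; thresholds are hypothetical."""
    profile = mean_profile(class_probs)
    top = max(range(len(profile)), key=lambda i: profile[i])
    if top == correct and profile[correct] >= high:
        return "well learned"
    if top != correct and profile[top] >= low:
        return "shared misconception or misleading answer"
    return "widespread uncertainty"


# Invented class responses to one four-answer item whose correct answer is 1.
class_probs = [
    [0.0, 0.9, 0.1, 0.0],
    [0.0, 0.8, 0.2, 0.0],
    [0.1, 0.7, 0.1, 0.1],
]
print(mean_profile(class_probs))
print(describe_item(class_probs, correct=1))
```

A choice test would show only how many students marked each answer; the
probabilities also show how firmly each answer is held, which is what
lets a wrong answer that attracts strong belief stand out from one that
is merely guessed.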

Now, I could proceed gradually through many levels of increasing
sophistication of application, each level promising a further increase
in the effectiveness of the educational process, and finally arrive at the
level of adaptive programmed instruction with branching decisions based
on the student's probabilities rather than on his choices. In fact, Jim
Baker is experimenting with this type of cybernetic instruction at the
present time and I think that he is finding it quite exciting. However,
due to lack of time, I would like to skip these intermediate levels of
application and, instead, briefly introduce the notion of sequential
testing where the next item to be presented to the student depends upon
the previous items and his responses to these items. Choice methods
leave too much ambiguity about the student's knowledge to be
used this way, but the existence of admissible probability measurement
procedures makes this type of testing appear to be highly promising. The
promise resides in the possibility that by utilizing information about
the structure of the subject matter material and about the way the student
learns, the scope of a student's knowledge about a content area can be
determined by asking only a minimal number of questions. The test would
be tailored to each student.

For example, in some cases test items can be written with different
degrees of difficulty so that if a student knows a particular item, he is
almost certain to know the easier items. Thus, if a student indicates
almost complete certainty in the correct answer to this particular item, a
much more difficult item could be presented next while if he indicated
almost complete uncertainty, a much easier item could be presented next.
Such a testing strategy could determine his level of knowledge very
quickly by asking very few questions.
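
A minimal sketch of such a strategy, assuming the items are ordered from
easiest to hardest and that a student who knows an item knows all easier
ones. The halving search, the .9 confidence threshold, and the names
locate_level and ask are assumptions; the talk describes only the two
extreme cases, so this sketch treats anything short of near certainty as
"not yet known", which discards information a fuller procedure would
use.

```python
def locate_level(n_items, ask, confident=0.9):
    """Find how many of the difficulty-ordered items the student appears to know.

    ask(i) returns the probability the student assigns to the correct
    answer of item i (0 is the easiest item).
    """
    low, high = 0, n_items   # the level lies somewhere in [low, high]
    asked = []
    while low < high:
        mid = (low + high) // 2
        asked.append(mid)
        if ask(mid) >= confident:
            low = mid + 1    # near certainty: jump to much harder items
        else:
            high = mid       # otherwise: fall back to much easier items
    return low, asked


# Simulated student who is nearly certain on the six easiest of 16 items.
def simulated_student(i):
    return 0.95 if i < 6 else 0.25


level, asked = locate_level(16, simulated_student)
print(level, asked)
```

For the simulated student the level is found after four questions
instead of sixteen.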

For another example, in other cases items can be written so that
knowledge of the correct answer depends jointly on knowledge of several
different, less complex items. Proofs in mathematics and the translation
of sentences or phrases provide concrete examples of this type of structure.
In this case, if a student expresses a great deal of confidence in the correct
answer to one of these complex items, he could then be tested on a different
topic represented by another complex item while if he expresses considerable
uncertainty, he could then be tested on one of the less complex items to
determine the source of his uncertainty. This is another testing strategy
which would, as before, determine the scope of a student's knowledge with
great efficiency.
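
This second strategy can be sketched as a walk down a structure of
component items. The structure, the item names, the beliefs, the .9
threshold, and the function name diagnose are all hypothetical; the
sketch shows only the branching logic described above.

```python
def diagnose(item, components, ask, confident=0.9):
    """Return the items at which the student's uncertainty appears to originate.

    components maps a complex item to its less complex component items;
    ask(item) returns the probability given to the item's correct answer.
    """
    if ask(item) >= confident:
        return []                      # confident: move on to another topic
    sources = []
    for part in components.get(item, []):
        sources.extend(diagnose(part, components, ask, confident))
    # If every component is known, the difficulty lies in the item itself.
    return sources or [item]


# Hypothetical structure for a translation item.
components = {
    "translate sentence": ["vocabulary", "verb tense"],
    "verb tense": ["present", "past"],
}
# Hypothetical probabilities the student gives to each correct answer.
beliefs = {
    "translate sentence": 0.4,
    "vocabulary": 0.95,
    "verb tense": 0.5,
    "present": 0.95,
    "past": 0.3,
}
print(diagnose("translate sentence", components, beliefs.get))
```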

The usefulness of sequential testing could be further increased by
associating with the questions at different levels references to chapters
and to sections in textbooks and, where appropriate, additional problems
and examples. This would allow the diagnostic information provided by
sequential testing to be used to recommend remedial or supplementary study
for the individual student according to the scope of his knowledge.
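
As a small illustration of the bookkeeping involved, each item can carry
its own study references, and the items on which a student proved
uncertain then select the recommendations. The class Item, the function
recommend_study, and every reference string below are placeholders, not
real materials.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    name: str
    references: list = field(default_factory=list)   # placeholder study references


def recommend_study(items, uncertain_names):
    """Collect, without repeats, the references attached to the uncertain items."""
    recommendations = []
    for item in items:
        if item.name in uncertain_names:
            for ref in item.references:
                if ref not in recommendations:
                    recommendations.append(ref)
    return recommendations


items = [
    Item("past", ["Chapter A, Section 1 (placeholder)", "Exercise set X (placeholder)"]),
    Item("present", ["Chapter A, Section 2 (placeholder)"]),
    Item("vocabulary", ["Chapter B (placeholder)"]),
]
print(recommend_study(items, {"past"}))
```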

Clearly, sequential testing would be a more efficient and a more
enjoyable form of testing. More enjoyable to the student, I should
hasten to add. Writing these sequential tests would require much too
much time of a classroom teacher operating under typical conditions.
Therefore, we should expect that textbook publishers will make available
sets of sequential test materials to accompany their texts and that
test publishers will develop sequential tests for much improved diagnostic
and achievement testing.

Since sequential testing requires flexibility in the presentation of
items and considerable information processing, it should be conducted
under the control of a computer and possibly with computer-driven scopes
and light pens. Thus, we should expect that computer manufacturers will
make available to the schools completely pre-programmed computer systems
ready to accept the sequential testing materials provided by the publishers
and to give the tests to students both for evaluating their progress
through the course of instruction and as a means of guiding their study
of textbooks and other materials.

Finally, in what other ways can the combination of computers with
admissible probability measurement procedures improve instruction? I
don't know, but I do know that our ability to improve education depends,
in part, upon our knowledge of the teaching-learning process which in
turn depends upon our being able to observe the effect of instructional
procedures upon the knowledge structures of individual students. And
this observational process is accomplished with exquisite sensitivity and
precision by cybernetic testing.